As computer vision technology has advanced, classifying images according to their content has become both a significant task and a necessity. In this project, we propose two models: the first uses ORB for feature extraction and an SVM for classification, and the second uses a CNN architecture. The end goal of the project is to understand the concepts behind feature extraction and image classification. The trained CNN model is also converted to the TFLITE format for Android development.
In this project, we propose a CNN architecture to detect anomalous and suspicious activities. The activities chosen for this project are running, jumping, and kicking in public places, as well as carrying guns, bats, and knives in public places. We compare the trained model with previous models such as YOLO, VGG16, and VGG19. The trained model is then deployed for real-time detection, and the trained .h5 model is converted to the TFLITE format to build an Android classification app.
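Both of the projects above convert a trained Keras model to TFLite for Android deployment. A minimal sketch of that conversion step, using a tiny stand-in network rather than the actual trained model:

```python
import tensorflow as tf

# Tiny stand-in for the trained activity-recognition CNN (.h5 model).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),   # 4 example classes
])

# Convert the in-memory Keras model to a TFLite flatbuffer for Android.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # optional size/latency optimization
tflite_bytes = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)
```

A saved `.h5` file would first be loaded with `tf.keras.models.load_model(...)` and then passed through the same converter; the resulting `model.tflite` is what the Android app bundles.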
The proposed shopping assistant model, SANIP, will help blind people detect hand-held objects and obtain video feedback with information extracted from the detected objects. The proposed model consists of three Python models: custom object detection, text detection, and barcode detection. To detect hand-held objects, we created our own custom dataset comprising everyday goods such as Parle-G, Tide, and Lays. In addition, we collected images of shopping carts and exit signs, since using a shopping cart is essential for anyone and noticing exit signs matters in an emergency. For the other two models, the detected text and barcode information is converted from text to speech and conveyed to the blind user. The model was used to detect the objects it was trained on, and it successfully detected and recognized the desired outputs with good accuracy and precision.
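A hypothetical sketch of the glue layer such a system needs: the three detectors (object, text, barcode) each emit strings, and their outputs must be composed into one message for the text-to-speech stage. The function name, message format, and example inputs below are all illustrative, not SANIP's actual interface:

```python
def compose_announcement(objects, texts, barcodes):
    """Combine detector outputs into a single sentence for the TTS engine."""
    parts = []
    if objects:
        parts.append("Detected " + ", ".join(objects))
    for t in texts:
        parts.append("Label reads: " + t)
    for code in barcodes:
        parts.append("Barcode " + code)
    return ". ".join(parts) if parts else "Nothing detected"

# Illustrative detector outputs; the barcode value is a placeholder.
msg = compose_announcement(["Parle-G", "shopping cart"],
                           ["Best before 2025"],
                           ["0123456789012"])
```

The resulting `msg` string would then be handed to a text-to-speech engine (e.g., pyttsx3 or a cloud TTS service) to be spoken aloud.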
Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We propose a systems-oriented defense against this class of attacks and demonstrate its functionality for Amazon Alexa. We ensure that only the skills a user intends to invoke execute in response to voice commands. Our key insight is that we can interpret a user's intentions by analyzing their activity on counterpart systems of the web and smartphones. For example, the Lyft ride-sharing Alexa skill has an Android app and a website. Our work shows how information from counterpart apps can help reduce ambiguities in the skill invocation process. We build SkillFence, a browser extension that existing voice assistant users can install to ensure that only legitimate skills run in response to their commands. Using real user data from MTurk (N = 116) and experimental trials involving synthetic and organic speech, we show that SkillFence provides a balance between usability and security by securing 90.83% of skills that a user will need with a false-acceptance rate of 19.83%.
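A heavily simplified sketch of the counterpart-matching idea: among the candidate skills that could answer a voice command, prefer the one whose name matches an app the user actually has installed. The similarity metric, threshold, and example names are hypothetical simplifications, not SkillFence's actual matching logic:

```python
from difflib import SequenceMatcher

def resolve_skill(candidate_skills, counterpart_apps, threshold=0.8):
    """Pick the candidate skill backed by a matching counterpart app, if any."""
    def sim(a, b):
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    best_score, best_skill = 0.0, None
    for skill in candidate_skills:
        score = max((sim(skill, app) for app in counterpart_apps), default=0.0)
        if score > best_score:
            best_score, best_skill = score, skill
    return best_skill if best_score >= threshold else None

# "Lyft" is kept because a counterpart app exists; the sound-alike skill is not.
apps = ["Lyft", "Gmail", "Maps"]
chosen = resolve_skill(["Lyft", "Lift Tracker Pro"], apps)
```

Returning `None` when no candidate clears the threshold corresponds to refusing to run any skill, the fail-safe behavior a defense of this kind needs.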
Reinforcement learning is a machine learning approach based on behavioral psychology. It is focused on learning agents that can acquire knowledge and learn to carry out new tasks by interacting with the environment. However, a problem occurs when reinforcement learning is used in critical contexts where the users of the system need more information about, and more confidence in, the actions executed by an agent. In this regard, explainable reinforcement learning seeks to equip an agent in training with methods that explain its behavior in such a way that users with no experience in machine learning can understand it. One of these is the memory-based explainable reinforcement learning method, which uses an episodic memory to compute probabilities of success for each state-action pair. In this work, we propose to use the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks that need to be addressed first in order to solve a more complex task. The end goal is to verify whether it is possible to give the agent the ability to explain its actions in the global task as well as in the sub-tasks. The results show that it is possible to use the memory-based method in hierarchical environments with high-level tasks and to compute the probabilities of success that serve as a basis for explaining the agent's behavior.
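The core mechanism described above can be sketched in a few lines: an episodic memory counts, for each state-action pair, how often episodes that passed through that pair ended in success; the ratio is the explanation signal. This is a minimal sketch of the counting idea, with hypothetical state/action labels, not the paper's full method:

```python
from collections import defaultdict

class EpisodicSuccessMemory:
    """Track per (state, action) success counts to explain an agent's choices."""
    def __init__(self):
        # (state, action) -> [successes, visits]
        self.counts = defaultdict(lambda: [0, 0])

    def record_episode(self, transitions, success):
        """Credit every (state, action) visited in the episode with its outcome."""
        for state, action in transitions:
            entry = self.counts[(state, action)]
            entry[0] += int(success)
            entry[1] += 1

    def p_success(self, state, action):
        """Estimated probability that taking `action` in `state` leads to success."""
        succ, visits = self.counts[(state, action)]
        return succ / visits if visits else 0.0

mem = EpisodicSuccessMemory()
mem.record_episode([("s0", "right"), ("s1", "up")], success=True)
mem.record_episode([("s0", "right"), ("s1", "down")], success=False)
```

An explanation such as "the agent moved right in s0 because that choice succeeded 50% of the time" can then be generated directly from `p_success`; in the hierarchical setting, one such memory per sub-task yields explanations at both levels.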
Content scanning systems employ perceptual hashing algorithms to scan user content for illegal material, such as child pornography or terrorist recruitment flyers. Perceptual hashing algorithms help determine whether two images are visually similar while preserving the privacy of the input images. Several efforts from industry and academia propose to conduct content scanning on client devices such as smartphones due to the impending rollout of end-to-end encryption that will make server-side content scanning difficult. However, these proposals have met with strong criticism because of the potential for the technology to be misused and repurposed. Our work informs this conversation by experimentally characterizing the potential for one type of misuse -- attackers manipulating the content scanning system to perform physical surveillance on target locations. Our contributions are threefold: (1) we offer a definition of physical surveillance in the context of client-side image scanning systems; (2) we experimentally characterize this risk and create a surveillance algorithm that achieves physical surveillance rates of >40% by poisoning 5% of the perceptual hash database; (3) we experimentally study the trade-off between the robustness of client-side image scanning systems and surveillance, showing that more robust detection of illegal material leads to increased potential for physical surveillance.
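To make the perceptual-hashing mechanism concrete, here is a toy "average hash": downsample an image to an 8x8 grid of block means, threshold at the mean, and compare hashes by Hamming distance. This is a standard textbook construction for illustration, not the (proprietary) hash any deployed scanning system uses:

```python
import numpy as np

def average_hash(img, hash_size=8):
    """Toy perceptual hash: block-mean downsample, then threshold at the mean."""
    h, w = img.shape
    bh, bw = h // hash_size, w // hash_size
    blocks = img[: bh * hash_size, : bw * hash_size]
    small = blocks.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    return (small > small.mean()).ravel()          # 64-bit boolean hash

def hamming(a, b):
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
img = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))   # horizontal gradient
noisy = img + rng.normal(0.0, 2.0, img.shape)         # near-duplicate of img
other = img.T                                         # visually different image

flagged_db = [average_hash(img)]                      # database of flagged hashes
d_near = hamming(average_hash(noisy), flagged_db[0])
d_other = hamming(average_hash(other), flagged_db[0])
```

The near-duplicate hashes close to the flagged entry while the different image stays far away; the surveillance attack in the paper exploits exactly this matching step by inserting hashes of benign target-location imagery into the flagged database.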
In recent years, unmanned aerial vehicle (UAV) technology has advanced rapidly, bringing to light new problems and challenges that require solutions. Furthermore, because the technology allows processes usually carried out by people to be automated, it is in great demand in industrial sectors. The automation of these vehicles has been addressed in the literature using different machine learning strategies. Reinforcement learning (RL) is a framework frequently used to train autonomous agents. RL is a machine learning paradigm wherein an agent interacts with an environment to solve a given task. However, learning autonomously can be time-consuming, computationally expensive, and may not be practical in highly complex scenarios. Interactive reinforcement learning allows an external trainer to provide advice to an agent while it is learning a task. In this study, we set out to teach an RL agent to control a drone using reward-shaping and policy-shaping techniques simultaneously. Two simulated scenarios were proposed for the training: one without obstacles and one with obstacles. We also studied the influence of each technique. The results show that an agent trained simultaneously with both techniques obtains a lower reward than an agent trained using only a policy-based approach. Nevertheless, the agent achieves lower execution times and less dispersion during training.
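The two techniques combined above can be sketched on a toy problem: a 1-D corridor stands in for the drone task, potential-based reward shaping adds `gamma * phi(s') - phi(s)` to each reward, and policy shaping lets a trainer's advice override action selection part of the time. The environment, advice rule, and hyperparameters are all illustrative, not the paper's setup:

```python
import random
random.seed(0)

N, GOAL = 6, 5
ACTIONS = (-1, 1)
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

def potential(s):
    """Reward shaping: potential increases as the agent nears the goal."""
    return -abs(GOAL - s)

def trainer_advice(s):
    """Policy shaping: an external trainer always suggests moving right."""
    return 1

def choose(s, eps=0.2, advice_prob=0.5):
    if random.random() < advice_prob:       # follow the trainer's advice
        return trainer_advice(s)
    if random.random() < eps:               # otherwise epsilon-greedy
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

alpha, gamma = 0.5, 0.9
for _ in range(500):
    s = 0
    for _ in range(50):
        a = choose(s)
        s2 = min(max(s + a, 0), N - 1)
        r = 1.0 if s2 == GOAL else -0.01
        r += gamma * potential(s2) - potential(s)   # potential-based shaping term
        target = r + gamma * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2
        if s == GOAL:
            break

greedy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)]
```

Potential-based shaping is used because it provably preserves the optimal policy; the advice probability plays the role the paper studies when comparing the techniques individually and together.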
Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models. DP-SGD applies isotropic Gaussian noise to gradients during training, which can perturb these gradients in any direction, damaging utility. Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable. In this paper we apply \textit{directional privacy}, via a mechanism based on the von Mises-Fisher (VMF) distribution, to perturb gradients in terms of \textit{angular distance} so that gradient direction is broadly preserved. We show that this provides $\epsilon d$-privacy for deep learning training, rather than the $(\epsilon, \delta)$-privacy of the Gaussian mechanism; and that experimentally, on key datasets, the VMF mechanism can outperform the Gaussian in the utility-privacy trade-off.
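A from-scratch sketch of the angular perturbation idea: sample a noisy direction from a VMF distribution centered on the true gradient direction (here via Wood's 1994 rejection sampler) and keep the gradient's magnitude. The concentration `kappa`, dimensionality, and the decision to preserve magnitude exactly are illustrative assumptions, not the paper's calibrated mechanism:

```python
import numpy as np

def sample_vmf(mu, kappa, rng):
    """Sample one unit vector from von Mises-Fisher(mu, kappa) (Wood, 1994)."""
    d = mu.shape[0]
    b = (-2 * kappa + np.sqrt(4 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1 - b) / (1 + b)
    c = kappa * x0 + (d - 1) * np.log(1 - x0**2)
    while True:   # rejection step for the cosine w = <sample, mu>
        z = rng.beta((d - 1) / 2, (d - 1) / 2)
        w = (1 - (1 + b) * z) / (1 - (1 - b) * z)
        if kappa * w + (d - 1) * np.log(1 - x0 * w) - c >= np.log(rng.uniform()):
            break
    # Uniform direction in the tangent space, then rotate e1 -> mu.
    v = rng.normal(size=d - 1)
    v /= np.linalg.norm(v)
    sample = np.concatenate(([w], np.sqrt(1 - w**2) * v))
    u = np.eye(d)[0] - mu
    n = np.linalg.norm(u)
    if n > 1e-12:
        u /= n
        sample = sample - 2 * u * (u @ sample)   # Householder reflection
    return sample

def perturb_gradient(grad, kappa, rng):
    """Directional perturbation: randomize direction, keep the magnitude."""
    norm = np.linalg.norm(grad)
    return norm * sample_vmf(grad / norm, kappa, rng)

rng = np.random.default_rng(0)
g = rng.normal(size=16)
noisy = [perturb_gradient(g, kappa=200.0, rng=rng) for _ in range(200)]
cos = [n @ g / (np.linalg.norm(n) * np.linalg.norm(g)) for n in noisy]
```

With large `kappa` the sampled directions stay tightly concentrated around the true gradient (mean cosine similarity near 1), which is exactly the "direction broadly preserved" property the abstract contrasts with isotropic Gaussian noise.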
Although fairness-aware machine learning algorithms have been receiving increasing attention, the focus has been on centralized machine learning, leaving decentralized methods underexplored. Federated learning is a decentralized form of machine learning in which clients train local models, with a server aggregating them to obtain a shared global model. Data heterogeneity among clients is a common characteristic of federated learning, and it may induce or exacerbate discrimination against unprivileged groups defined by sensitive attributes such as race or gender. In this work, we propose FairFate: a novel fair federated learning algorithm that aims to achieve group fairness while maintaining high utility via a fairness-aware aggregation method that computes the global model by taking clients' fairness into account. To this end, the global model update is computed by estimating a fair model update using a momentum term, which helps to overcome the oscillations of noisy, non-fair gradients. To the best of our knowledge, this is the first approach in machine learning that aims to achieve fairness using fair momentum estimates. Experimental results on four real-world datasets demonstrate that FairFate significantly outperforms state-of-the-art federated learning algorithms under different levels of data heterogeneity.
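A hypothetical sketch of fairness-aware aggregation with momentum: client updates are weighted by a per-client fairness score, and the resulting "fair update" is smoothed with an exponential momentum term before being applied to the global model. The weighting rule, score semantics, and `beta` value are illustrative assumptions, not FairFate's published update rule:

```python
import numpy as np

def fair_momentum_aggregate(global_w, client_ws, client_fairness,
                            beta=0.9, momentum=None):
    """One aggregation round: fairness-weighted average of client deltas,
    smoothed with momentum to damp oscillations from noisy non-fair updates."""
    deltas = np.stack([w - global_w for w in client_ws])
    scores = np.asarray(client_fairness, dtype=float)
    weights = scores / scores.sum()            # fairer clients contribute more
    fair_update = weights @ deltas
    momentum = (fair_update if momentum is None
                else beta * momentum + (1 - beta) * fair_update)
    return global_w + momentum, momentum

# One toy round: two clients, equal fairness scores -> a plain average.
global_w = np.zeros(3)
clients = [np.ones(3), 3 * np.ones(3)]
new_w, m = fair_momentum_aggregate(global_w, clients, [1.0, 1.0])
```

With equal fairness scores the rule reduces to FedAvg's plain average; skewing the scores toward fairer clients is what biases the global model toward group-fair solutions across rounds.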
Real-world images used to train machine learning algorithms are often unstructured and inconsistent. The process of analyzing and labeling these images can be costly and error-prone (and entails gaps and legal conundrums as well). However, as we demonstrate in this paper, the potential of accurate synthetic images indistinguishable from the real world offers many benefits within the machine learning paradigm. One such example is soccer data from broadcast services (television and other streaming sources). Soccer matches are typically recorded from multiple sources (cameras and phones) at multiple resolutions, not to mention occlusions of visual details and other artifacts (such as blur, weathering, and lighting conditions) that make features difficult to identify accurately. We demonstrate an approach that is able to overcome these limitations using generated labeled and structured images. The generated images are able to simulate a variety of views and conditions (including noise and blur) that may appear only occasionally in the real world and that make it difficult for machine learning algorithms to "cope" with these unforeseen problems in actual data. This approach allows us to rapidly train and prepare a robust solution that accurately extracts features (e.g., spatial positions, pitch locations, player positions, and camera FOV) from real-world soccer match sources for analytical purposes.
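The generate-then-degrade idea above can be sketched in a few lines: render a simple scene where the labels (here, player positions) are known exactly because we placed them, then apply broadcast-style degradations (blur, sensor noise) to produce training imagery. The toy renderer and the two degradations are illustrative stand-ins for the paper's full graphics pipeline:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def render_pitch(player_positions, size=(72, 108)):
    """Toy renderer: green pitch with 3x3 white player blobs.
    The positions ARE the labels -- no manual annotation needed."""
    img = np.zeros((*size, 3), dtype=np.float32)
    img[..., 1] = 0.5                                  # green field
    for (r, c) in player_positions:
        img[r - 1:r + 2, c - 1:c + 2] = (1.0, 1.0, 1.0)
    return img

def degrade(img, rng, blur_sigma=1.0, noise_std=0.05):
    """Simulate broadcast artifacts: optical blur plus sensor noise."""
    out = gaussian_filter(img, sigma=(blur_sigma, blur_sigma, 0))
    out += rng.normal(0.0, noise_std, img.shape)
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
labels = [(20, 30), (50, 80)]          # exact player positions, for free
clean = render_pitch(labels)
train_img = degrade(clean, rng)        # degraded image paired with exact labels
```

Because the renderer controls the scene, every degraded image comes with perfect labels, and rare conditions (heavy blur, unusual camera angles) can be generated on demand instead of waiting for them to occur in broadcast footage.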